ABSTRACT

The capability was studied of each of three models
for producing indices that will reproduce school effectiveness
rankings established a priori through simulation. The models used
were a within-group regression technique, a regression model using
individual scores, and a regression model using means. Data for 54
hypothetical schools on input, SES, school, and output variables were
randomly generated from a multivariate normal distribution using
parameters from previous studies. The results indicated that each
type of model was capable of producing indices which were rather
accurate reflections of the effectiveness ranks of schools. (Author)

An aspect of the school effectiveness question which has
received some attention recently is the determination of the
relative effectiveness of schools from indices produced by
statistical models. Marco (1974) examined five such models which are
based on longitudinal data (see also Convey, 1973). Typically,
the models have been used with available data, and conclusions have
been made as to the relative effectiveness of the schools involved
without the presentation of any evidence of the validity of the
models (Burke, 1973; Dyer, Linn, & Patton, 1969; Forsyth, 1973;
Marco, 1974).

The purpose of this study was to examine the capabilities of
three types of statistical models to produce indices that would
reproduce school effectiveness rankings established a priori.
Simulated data were used in the study so that the parameters could
be manipulated in order to make some schools more effective than
others according to an established criterion. The use of simulated
data, then, avoided the problem of relying on experts or consensus
opinion to determine the relative ranking of the schools.

This paper is based on the author's Ph.D. dissertation
submitted to the faculty of the Florida State University.

A paper presented at the annual meeting of the American
Educational Research Association, Washington, March 30-April 3, 1975.

Method

Variables 

The variables selected for the study were similar to those
used in previous school effectiveness studies and were classified
as output, input, socioeconomic (SES), and school variables. The
output and input variables were given the characteristics of the
total math score on the Comprehensive Tests of Basic Skills (CTBS)
for eighth graders (Level 3, Form R) and sixth graders (Level 3,
Form Q), respectively. The expanded standard score scale developed
for all levels and forms of the CTBS was used for the output and
input scores (California Test Bureau/McGraw-Hill, 1970). The SES
variable was given the characteristics of the index used in the
Talent Project as reported by Cooley and Lohnes (1971). The
characteristics of the scores on the Verbal Ability Test for
Teachers, as used in the Coleman Report (Coleman, Campbell, Hobson,
McPartland, Mood, Weinfeld, & York, 1966), were given to the
measure of the school variable.

Models 

Three types of models were used to produce indices:

1. Within-School Regression. For each school, a prediction equation
was obtained from the regression of the individual student output
scores on individual student input and SES scores. The equation
is:

$$O' = a + b_1 I + b_2\,\mathrm{SES} \qquad (1)$$

where $O'$ is the predicted output for an individual; $b_1$ and $b_2$
are the least squares estimates of the coefficients of input ($I$)
and SES within each school; and $a$ is the least squares estimate
of the constant for the school. This model produces a unique
regression plane for each school. In general, these planes
will not be parallel. Hence this model allows schools to be
tested for differential effectiveness at various values of $I$
and SES. An effectiveness index is defined for each school at
a specific combination of predictor values $(I_0, \mathrm{SES}_0)$ as the
predicted value at that point. Normally, several such ordered
pairs will be of interest. In addition to the two-predictor
model, a model using input only as predictor was examined.

2. Individual Regression Residuals. For the total group, a
prediction equation was obtained from the regression of the
individual output scores on the predictors $I$, SES, and SV (school
variable, which was assumed constant for all individuals in a
given school). The equation is:

$$O' = p + q_1 I + q_2\,\mathrm{SES} + q_3\,\mathrm{SV} \qquad (2)$$

where $O'$ is the predicted output for an individual; $q_1$, $q_2$, and $q_3$
are the least squares estimates of the coefficients of the
predictors based on the total group; and $p$ is the least squares
estimate of the constant based on the total group. In addition
to the above three-predictor model (IRR), a two-predictor model
(IR2) using $I$ and SES, and a one-predictor model (IR1) using
$I$ were examined. In each model, the residuals for individuals
were obtained. The effectiveness index for each school was
calculated by averaging the residuals within each school.

3. School Regression Residuals. For the total group, a prediction
equation was obtained from the regression of the mean output
for each school on the mean of each predictor, $\bar{I}$, $\overline{\mathrm{SES}}$, and SV,
for each school. The equation is:

$$\bar{O}' = r + s_1 \bar{I} + s_2\,\overline{\mathrm{SES}} + s_3\,\mathrm{SV} \qquad (3)$$

where $\bar{O}'$ is the predicted mean output for the school; $s_1$, $s_2$,
and $s_3$ are the least squares estimates of the coefficients of
the predictors when means are used; and $r$ is the least squares
estimate of the constant based on the means for each school.
In addition to the above three-predictor model (SRR), a two-predictor
model (SR2) using $\bar{I}$ and $\overline{\mathrm{SES}}$, and a one-predictor
model (SR1) using $\bar{I}$ were examined. In each model, the
effectiveness index for each school is the residual, that is, the
observed mean output $\bar{O}$ minus the predicted mean output $\bar{O}'$.

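
The three index definitions above can be sketched in code. The following is a minimal illustration, assuming NumPy and ordinary least squares with a constant term; the function names and array layout are hypothetical and are not taken from the original study. Residuals are computed as observed minus predicted.

```python
import numpy as np

def fit_ols(X, y):
    """Least squares coefficients for y on X; a constant column is prepended."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def within_school_index(inp, ses, out, school, point):
    """Model 1: value of each school's own regression plane at (I0, SES0)."""
    i0, ses0 = point
    index = {}
    for s in np.unique(school):
        m = school == s
        a, b1, b2 = fit_ols(np.column_stack([inp[m], ses[m]]), out[m])
        index[s] = a + b1 * i0 + b2 * ses0
    return index

def individual_residual_index(inp, ses, sv, out, school):
    """Model 2: mean within-school residual from one total-group equation."""
    X = np.column_stack([inp, ses, sv])
    coef = fit_ols(X, out)
    resid = out - (coef[0] + X @ coef[1:])
    return {s: resid[school == s].mean() for s in np.unique(school)}

def school_residual_index(inp, ses, sv, out, school):
    """Model 3: residual of each school's mean output, regressing on school means."""
    ids = np.unique(school)
    means = np.array([[v[school == s].mean() for v in (inp, ses, sv, out)]
                      for s in ids])
    X, y = means[:, :3], means[:, 3]
    coef = fit_ols(X, y)
    resid = y - (coef[0] + X @ coef[1:])
    return dict(zip(ids, resid))
```

Dropping columns from `X` gives the two-predictor and one-predictor variants of the residuals models; the within-school function would need a one-predictor counterpart for the input-only case.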
Sample 

Data for 54 hypothetical schools were randomly generated for
the four variables on the CDC 6500 computer at Florida State 
University using subroutine MSCORE. MSCORE generates random data 
from a multivariate normal distribution. The user specifies the 
means and standard deviations of, and the intercorrelations among, 
the variables. 

Insert Table 1 about here 

First, 54 ordered sets representing scores for output, input,
SES, and school variable were generated according to the
specifications shown in Table 1. These specifications were consistent
with results of previous studies. The 54 scores for each variable
were assumed to constitute the population means for the variables
in each group. There is considerable empirical evidence to
indicate that, for achievement tests, the standard deviation of
the distribution of school means is from .3 to .6 of the standard
deviation of scores in the total population, regardless of group
size (Lindquist, 1930, 1966; Lord, 1959). Lord (1959) suggested
that .4 would be a good approximation. This approximation was
used in the specifications for the generation of the input, output,
and SES variables. The individual score standard deviation was
used in generating the 54 school variable scores, since this
variable was assumed to be constant within each group and maximum
variability across groups was desired.
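
As an arithmetic check on this approximation, applying the factor .4 to the individual-score standard deviations used in this study (85.90 for output, 68.30 for input, and 9.42 for SES) gives the group-mean standard deviations specified in Table 1, up to rounding:

$$
\begin{aligned}
.4 \times 85.90 &= 34.36 && \text{(output)} \\
.4 \times 68.30 &= 27.32 && \text{(input)} \\
.4 \times 9.42 &= 3.768 \approx 3.77 && \text{(SES)}
\end{aligned}
$$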

Second, the 54 groups were ordered from high to low on the
input score. The top 18 were designated as high, the next 18
medium, and the lowest 18 as low. Within each category, six
groups were randomly designated as effective, six as average,
and six as less effective. Effectiveness was defined in terms
of the gain from the input mean to the output mean on the standard
score scale. The input and output means were paired so as to
satisfy the effectiveness criteria of effective (gain greater
than 68), average (gain between 46 and 68), and less effective
(gain less than 46). In this study, 68 units represented
approximately one standard deviation on the input distribution.
These criteria then appear to be consistent with previous studies
(see Coleman et al., 1966; Guthrie, 1970; Shaycoft, 1967).
Table 2 shows the characteristics of the ordered sets after the
pairing. These values were considered to be reasonably close
to the generating values in Table 1.
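
The effectiveness criteria can be written as a small classification rule. This is only a sketch: the function name is hypothetical, and the treatment of gains exactly equal to 46 or 68 (assigned here to the average category) is an assumption, since the text does not say how boundary values were handled.

```python
def classify_gain(input_mean, output_mean, low=46.0, high=68.0):
    """Classify a group by its gain from input mean to output mean.

    Thresholds follow the text: gain > 68 is effective, gain < 46 is
    less effective, anything in between is average. Boundary values are
    placed in the average category by assumption.
    """
    gain = output_mean - input_mean
    if gain > high:
        return "effective"
    if gain < low:
        return "less effective"
    return "average"
```

For instance, the generating means in Table 1 (input 474.00, output 539.00) imply a gain of 65 units, which this rule places in the average category.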

Insert Table 2 about here 

The input-output pairings were made so that within each
effectiveness classification, groups with high, medium, and low
inputs were equally represented. This procedure attempted to
control for any bias that might be introduced by an overbalance
of certain levels of inputs in any one classification. For
example, it is well known that gain scores usually have a negative
bias with respect to the initial scores (O'Connor, 1972). Thus,
low inputs would tend to show larger gains, and the effective
category might have contained a disproportionate number of low
inputs had not the above procedure been adopted.

Insert Table 3 about here 

Prior to the generation of individual scores within each 
school, group size was varied according to the plan shown in 
Table 3. This distribution is consistent with field results 
(Florida Ninth-Grade Testing Program, 1968). A table of random 
numbers was used to implement this plan. Group size was uniformly
distributed over the different effectiveness classifications.
The total group consisted of 9087 individuals and group size
ranged from 20 to 399. The resulting distribution is given in
Table 4. 

Insert Table 4 about here 

Next, individual student scores were randomly generated 
within each group using MSCORE with the intercorrelations shown 
in Table 1, the means in Table 2, and the standard deviations of 
85.90 for output, 68.30 for input, and 9.42 for SES as parameters. 
The score for the school variable assigned to each individual
in each group was the mean for their respective group.

After each group was formed, the sample mean for each group
was calculated. Some reranking of the groups occurred as a
result of these sample values. It should be noted that these
sample values would be directly observable when examining real
data. The a priori data and the a priori ranks would not be
observable.
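
The within-group generation step can be illustrated as follows. MSCORE itself is not reproduced here; NumPy's multivariate normal generator is swapped in as a stand-in, and the function name and column order are hypothetical. The standard deviations come from the text and the intercorrelations from the generating specifications in Table 1.

```python
import numpy as np

# Individual-score standard deviations from the text: output, input, SES.
SD = np.array([85.90, 68.30, 9.42])
# Generating intercorrelations among output, input, and SES (Table 1).
R = np.array([[1.00, 0.73, 0.45],
              [0.73, 1.00, 0.45],
              [0.45, 0.45, 1.00]])
COV = R * np.outer(SD, SD)  # covariance matrix implied by SD and R

def generate_group(pop_means, n, school_value, rng):
    """Draw n (output, input, SES) score sets around a group's population
    means and append the constant school-variable score as a fourth column."""
    scores = rng.multivariate_normal(pop_means, COV, size=n)
    sv = np.full((n, 1), school_value)
    return np.hstack([scores, sv])
```

Because each group's sample means will differ from the population means passed in, gains computed from the sample means can reorder groups relative to the a priori ranking, which is the reranking described above.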

Characteristics of Generated Data 

The general characteristics of the generated data are given 
in Table 5. These values are reasonably close to the desired
parameters. The negative correlations involving the school
variable caused some concern. Because of the large number of
cases, each of these correlations calculated from individual
scores would be statistically different from the desired values
at any reasonable level of significance. The school variable
being constant within each group could have been a contributing
factor to the negative correlations. However, it was concluded
that the discrepancies noted were slight enough to continue with
the applications of the models to the generated data.

Insert Table 5 about here

Finally, a characteristic of the generated data which may
have some influence on the results should be noted. The range
of correlations between input and output was .654 to .808, between
input and SES, .209 to .589, and between output and SES, .263 to
.621. These values appear to be within the bounds of sampling
variation. However, it may be that in some actual settings,
correlations outside of this range may systematically occur for
some groups. This would influence the behavior of the regression
equation for each group and may affect the rankings produced by
the Within-School Regression models for certain choices of
predictor values. This limitation should be kept in mind when
examining the results.

Procedures 

The models were applied to the 54 groups and effectiveness
indices were calculated. For the two-predictor Within-School
Regression model, the following combinations of predictor values
were used:

1) input mean (474) and SES mean (98.54);

2) 1 $\sigma$ above the input mean (542.3) and 1 $\sigma$ above the
SES mean (107.96);

3) 1 $\sigma$ below the input mean (405.7) and 1 $\sigma$ below the
SES mean (89.12).

These applications were designated as WSR1, WSR2, and WSR3, respectively.
Effectiveness indices were also calculated using the one-predictor
Within-School Regression model with values at the mean, and one
standard deviation above and below the mean, of the input. These
models were designated as WR1, WR2, and WR3, respectively.
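
The comparison points listed above follow from the input and SES means together with the individual-score standard deviations used in generating the data (68.30 for input and 9.42 for SES):

$$
\begin{aligned}
474 + 68.30 &= 542.30, & 474 - 68.30 &= 405.70, \\
98.54 + 9.42 &= 107.96, & 98.54 - 9.42 &= 89.12.
\end{aligned}
$$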

The ability of the models to produce accurate rankings was 
examined by obtaining correlations between the a priori ranks 
and the ranked indices produced by each model. In addition, 
selected pairwise comparisons between the correlations 
associated with each model and the a priori ranks were obtained
and tested for significance in order to determine if any one model 
was superior to the others in reproducing the a priori ranks. 
Simultaneous inference procedures were employed to prevent the 
compounding of a Type I error beyond a specified value. 
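
A correlation between two sets of ranks can be computed as the ordinary product-moment correlation of the rank values, which is Spearman's rank correlation when there are no ties. The sketch below assumes that form; the text does not name the coefficient used, and the function names are hypothetical.

```python
import numpy as np

def ranks_desc(values):
    """Rank indices so that rank 1 is the largest value (assumes no ties)."""
    order = np.argsort(-np.asarray(values, dtype=float))
    ranks = np.empty(len(order), dtype=int)
    ranks[order] = np.arange(1, len(order) + 1)
    return ranks

def rank_correlation(ranks_a, ranks_b):
    """Product-moment correlation between two sets of ranks."""
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])
```

In use, the effectiveness indices from a model would be passed through `ranks_desc` and then correlated with the a priori ranks.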

Next, the ability of the models to discriminate between the
18 effective groups and the 18 less effective groups was examined.
For the Within-School Regression models, one-tailed Bonferroni
confidence intervals were constructed on the prediction surface
representing each of the 18 effective groups and 18 less effective
groups (see Miller, 1966). Each interval was constructed at $\alpha = .001$,
thus maintaining an overall experimental $\alpha$ less than .324
(18 × 18 = 324 comparisons) for each model. The length of each
interval was given by:

$$t_{\alpha,\nu}\, s \sqrt{a\,(X'X)^{-1}a'} \qquad (4)$$

where $t_{\alpha,\nu}$ is the one-tailed Student $t$ value; $s$ is the estimated
standard error for each surface; $a$ is $(1, I_0, \mathrm{SES}_0)$ for the two-predictor
models and $(1, I_0)$ for the one-predictor models; and
$(X'X)^{-1}$ is the value from the appropriate normal equations.
For each of the six Within-School Regression models, each
effective group was compared with each less effective group. A
comparison was declared different if the respective confidence
intervals did not overlap.
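
The interval comparison can be sketched as below, assuming the usual least squares form for the interval length at a point, $t \cdot s \sqrt{a (X'X)^{-1} a'}$, built from the quantities defined above. The critical $t$ value is passed in rather than computed, and the function names are hypothetical.

```python
import numpy as np

def interval_length(X, y, point, t_crit):
    """Predicted value and interval length at `point` for one school.

    X holds the predictor column(s), y the output scores, and t_crit the
    one-tailed Student t value chosen by the user for the desired alpha.
    """
    A = np.column_stack([np.ones(len(y)), X])          # design matrix with constant
    a = np.concatenate([[1.0], np.atleast_1d(point)])  # (1, I0) or (1, I0, SES0)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    s = np.sqrt(resid @ resid / (len(y) - A.shape[1]))  # standard error of estimate
    pred = a @ coef
    length = t_crit * s * np.sqrt(a @ np.linalg.inv(A.T @ A) @ a)
    return pred, length

def declared_different(pred_eff, len_eff, pred_less, len_less):
    """True when the effective group's interval lies wholly above the
    less effective group's interval, so the two do not overlap."""
    return pred_eff - len_eff > pred_less + len_less
```

The length grows as the comparison point moves away from a school's own predictor means and as group size shrinks, which is why comparisons at one standard deviation from the mean are harder to declare different.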

The ability of the residuals models to discriminate was
investigated using "Performance Indices" (PIs) as suggested by
Dyer, Linn, and Patton (1967). For each model, the following
ratio was calculated for each group:

$$R = \frac{\text{residual}}{\bar{s}_w / \sqrt{\bar{n}}} \qquad (5)$$

where $\bar{s}_w$ is the average within-group standard deviation on the
output for all groups, and $\bar{n}$ is the average group size. PIs
were then calculated by the following rule:

$$
\mathrm{PI} =
\begin{cases}
1, & R < -1.5 \\
2, & -1.5 < R < -.5 \\
3, & -.5 < R < .5 \\
4, & .5 < R < 1.5 \\
5, & 1.5 < R
\end{cases}
\qquad (6)
$$

Different decision rules to determine when two PIs are different
were used. Forsyth (1973) suggested that the criterion be at
least 2 units. In addition, rules requiring at least 3 units
and at least 4 units were examined.
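
The Performance Index rule and the decision rules can be sketched as follows. The ratio is assumed to be the residual divided by the average within-group standard deviation over the square root of the average group size, which is inferred from the symbol definitions in the text; the handling of values falling exactly on a cut point is also an assumption, and the function names are hypothetical.

```python
import math

def performance_index(residual, s_w, n_bar):
    """Map a group's residual to a Performance Index from 1 to 5.

    s_w is the average within-group standard deviation on the output and
    n_bar the average group size. Values exactly on a cut point fall into
    the higher category by assumption.
    """
    r = residual / (s_w / math.sqrt(n_bar))
    if r < -1.5:
        return 1
    if r < -0.5:
        return 2
    if r < 0.5:
        return 3
    if r < 1.5:
        return 4
    return 5

def differ(pi_a, pi_b, min_units=2):
    """Decision rule: two groups differ when their PIs are at least
    min_units apart (2, 3, or 4 in the text)."""
    return abs(pi_a - pi_b) >= min_units
```

Raising `min_units` from 2 to 4 trades missed real differences for fewer misclassifications, which is the trade-off the three decision rules are meant to explore.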

Results and Discussion 
The ranks of the effectiveness indices produced by each
model are shown in Table 6 along with the a priori and the
sample ranks. The effects of sampling can be seen by comparing
the first two columns of Table 6. For the most part, the sample
ranks were reasonably close to the a priori ranks. Only 11 of
the 54 groups showed a discrepancy of more than 5 ranks and, of
these, only groups 15 and 32 have discrepancies of more than 10
units. These discrepancies influenced the behavior of the models
since the models will reproduce the sample ranks more accurately
than the a priori ranks.

Insert Table 6 about here

Examination of Table 6 revealed that the ranks assigned by
each model were rather consistent for most groups. A striking
consistency occurred among the top 10 groups and the last 14 or
15 groups. This seemed to indicate that each model would be
rather accurate for at least gross discriminations of effective
from less effective groups.

Reproduction of A Priori Ranks

The intercorrelations among the a priori ranks, the sample
ranks, and each of the model ranks are shown in Table 7. Each
correlation is substantial and significant at $\alpha = .001$. Since
91 hypotheses were being considered simultaneously, the use of
a Bonferroni strategy guaranteed that the overall $\alpha$ was not
greater than .091.
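
The count of 91 is consistent with all pairwise correlations among 14 sets of ranks, presumably the a priori ranks, the sample ranks, and the 12 model applications (WSR1 to WSR3, WR1 to WR3, SRR, SR2, SR1, IRR, IR2, and IR1). The Bonferroni bound is then the number of tests times the per-test level:

$$\binom{14}{2} = \frac{14 \times 13}{2} = 91, \qquad 91 \times .001 = .091$$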

Insert Table 7 about here 

The correlations of the model ranks with the a priori ranks 
were highest for the one-predictor application of each model type. 
When SES was added as a predictor, each of the correlations
decreased. Very little change occurred when the SV predictor
was added. This same trend was present in the correlations of
the model ranks with the sample ranks. 

The phenomenon of decreasing correlations between the model
ranks and the a priori ranks was initially unexpected. Perhaps
this phenomenon was due to the increased influence of random
error on the ranks as the number of predictors is increased.
When a predictor is added, there is less error variation in the
system. However, a larger proportion of that variation may be
due to random error. When the indices were ranked and correlated
with the a priori ranks, these correlations may have diminished
as the number of predictors increased because of this increased
role of random error. A similar phenomenon was also illustrated
in data reported by Gastright (1974). Gastright attributed this
to over-fitting the model to the data. Perhaps, but the
increased influence of random error on the residuals as the
number of predictors increased seems more plausible.

Insert Table 8 about here 

The results of the significance tests of selected differences 
between each of the correlations of the model ranks with the a 
priori ranks are shown in Table 8. The most drastic reduction 
in the ability to reproduce the a priori ranks occurred in the 
School Regression Residuals model. The differences between the 
correlations with the one-predictor SR1 ranks were significant
at $\alpha = .005$. A less stringent individual comparison level of
.05 would have resulted in most of the other one-predictor models
being declared different from their two- and three-predictor
counterparts.

The above evidence seems to indicate that basically different 
results are possible when models are used with varying number of 
predictors. If the groups are to be ranked on a given criterion 
which is influenced by all variables, the observed mean gain seems
to be the best measure of effectiveness. The correlation between
the sample ranks and the a priori ranks was .9587 in this study.
If a ranking is desired where input is controlled, a model using
input as a predictor would seem to be appropriate. The
similarity of the rankings from these one-predictor models to the 
a priori ranks will depend upon the relationship between the 
criterion predicted by input and the criterion used to establish 
the a priori ranks. If groups are to be ranked controlling for 
the influence of SES and input, then a model using both of these 
as predictors should be employed. Thus, it appears that the 
issue being illustrated here is one of proper model specification. 

Once the number of predictors is decided upon, the question 
still remains as to which type of model is superior. The
correlations involving the different types of one-predictor
models were very similar. These results are consistent with
previous research using nonhypothetical data (see Dyer et al.,
1969; Marco, 1974). The results among the two-predictor models
were somewhat less consistent. For example, the correlation of
the SR2 ranks with the a priori ranks was significantly different
from the correlation of the IR2 model ranks with the a priori
ranks. However, the intercorrelations among the ranks of these
two-predictor models ranged from .9685 to .9855. These latter
correlations are probably more representative of the agreement
among the models than are the correlations of the model ranks
with the a priori ranks, since they are comparing ranks based
on similar criteria. Thus, the significant difference noted may
have resulted from the high relationship between the models and
may not indicate any superiority on the part of the IR2 model.
The same was noted in contrasting the results of the three-predictor
models. Hence, no conclusive evidence appears to be
available to indicate the superiority of any model type.

Another trend evident from the examination of Table 7 was 
that, for the Within-School Regression models, higher correlations 
with the a priori ranks occurred in the models where prediction 
was made about the means rather than at one standard deviation
above or below the means. However, none of the comparisons was
significant at $\alpha = .005$ within either the one-predictor or
two-predictor models. One significant difference was noted across
models when WR1 and WR2 were compared.

It should be recalled that each group received one a priori 
effectiveness rank which was based on all the individuals within 
the group. No attempt was made to manipulate the parameters so 
that some groups would be made more effective for individuals 
who had high or low predictor values. In nonhypothetical 
situations, some schools may be differentially effective for 
students at different levels of achievement or SES. The Within- 
School Regression models seem to be ideal for this type of 
situation. However, it was not intended to create this type of 
differential effectiveness in this study. Any variations in 
correlations between the model ranks and the a priori ranks were 
products of the models and sampling error, and not differential
effectiveness.

Discrimination Ability

Within-School Models. The confidence intervals calculated
for each effective and less effective group for each of the
Within-School Regression models varied considerably in length.
The length of the intervals is a function of group size, the
standard error for each equation, and the point about which the
interval was constructed.

Insert Table 9 about here 

Table 9 summarizes the number of significant comparisons
between the effective and less effective groups for each
Within-School Regression model. As expected, the models using means as
comparison points (WSR1 and WR1) had a higher number of significant
comparisons than did the other models in the family. Two
of the 36 groups in question exhibited atypical behavior. One
showed only four significant results in the 108 possible comparisons
across the six models; the other showed only eleven. In
both cases, sampling errors resulted in classifying the groups
as average, thus making them more proximate to the comparison
groups against which they were being contrasted. In addition,
one group consisted of only 24 individuals. If these groups
were removed from the comparisons, the resulting discrimination
accuracy of the WSR1 model would increase to 83.7% and the WR1
model to 88.9%, with a corresponding reduction in the probability
of a Type I error for all comparisons considered simultaneously
from .324 to .289.

The ability of the Within-School Regression models to
discriminate between effective and less effective groups when
prediction is made at the means was quite good despite the
rather stringent $\alpha$ of .001. The one-predictor models showed
slightly better discrimination than the two-predictor models.
Group size and the location of the comparison points influenced
the ability of the models to discriminate to a greater extent
than did the standard errors of the groups.

Residuals Models. Table 10 shows the distribution of PIs
over the a priori classifications for each model. Only the SR1
model assigned an index of 5 exclusively to effective groups.
Also, this same model assigned an index of 1 to 17 of the less
effective groups. The distributions in the other models were
rather similar. From the evidence presented in Table 10, it
does not appear that one model is superior to the others in
discriminating between effective and less effective groups using
PIs.

Insert Table 10 about here 

Table 11 shows the effects of different rules used to decide 
if two groups should be considered different on the basis of 
their PIs. For the data in this study, a rule of at least
a difference of 2 units correctly identified almost all of the
comparisons between the effective and the less effective groups.
The models using individual scores were slightly more accurate
than the models using mean scores. Both one-predictor models
correctly identified all but one comparison. However, a number
of incorrect decisions would have been made using this rule,
both within categories and between categories. For example,
one effective group would have been misclassified if the
Individual Residuals models were used, and three effective
groups, if the School Residuals models were used. Also, most
of the groups in the average category would have been declared
more effective than almost all of the groups in the less
effective category, despite the fact that the real differences
between some of these groups may not be large enough to warrant
this discrimination.



Insert Table 11 about here 



A more stringent decision rule requiring a difference of at
least 3 units resulted in a percentage of the real differences
being lost; however, the number of misclassifications was likewise
reduced. The most stringent rule of a difference of at least 4
units resulted in at least 74% correct classifications on each
model with almost no misclassifications. This 74% compares
favorably with the percentage of significant comparisons found
with the WSR1 and WR1 models when statistical procedures were
employed at a rather low significance level of .001.

From the above, it appears that PIs are useful in discriminating
between groups which have rather large differences in
effectiveness. Attempts to make fine discriminations would
appear to be unwarranted. If it is desired to be able to discriminate
almost all of the effective groups from the relatively
ineffective ones in order to examine more closely why they may
be effective or ineffective, a decision rule requiring at least
a difference of 2 PI units would seem to be most useful. If
misclassifications are of concern, more stringent decision rules
would be more appropriate. A general strategy might be to use
the most stringent rule initially to determine gross differences,
and then apply the less strict rules in turn with increasing
caution. In this way, a rather good profile of the relative
effectiveness of the schools involved should be obtained.

Summary

The results indicated that each type of model was capable
of producing indices which were rather accurate reflections of
the effectiveness ranks and classifications established a priori.
No conclusive evidence was present to indicate that any one
model was superior to the others in accomplishing this task.
Since the School Regression Residuals models are easier to
apply than the other models and the data for them are usually
readily available, they could be considered superior in a
cost-effectiveness sense. The use of PIs in conjunction with the
School Regression Residuals models will enable appropriate
discriminations to be made between most of the schools possessing
differences in effectiveness. Different decision rules can be
employed in accordance with their relative strictness in order
to identify almost all of the schools possessing a certain
degree of differential effectiveness.

The Within-School Regression model seems to be most useful 
in a situation where it is suspected that the schools may be 
differentially effective for students possessing different
characteristics on the predictors used in the model. This model
will generally produce a different set of ranks for each combination
of predictor values. These ranks depend to a great extent
on the sizes of the schools used and the location of the
predictor values relative to their respective means. If schools
are not differentially effective for certain kinds of students,
this model will yield results very similar to those produced
by the other models. This model is difficult to apply since a
regression equation must be obtained for each school and
individual student data are required.

The results also indicated that, as additional predictors 
were added to the models, the correlations between the ranked 
indices and the a priori ranks decreased. This could be due 
to random error playing an increased role in establishing the 
effectiveness indices as the residual variation in the models 
decreased. This probably is related to the restriction in 
range phenomenon. Therefore, results from models using a 
different number of predictors may not be directly comparable. 
As a result, proper model specification, either through theory 
or the results of previous research or personal insight, is 
deemed essential in attempting to determine the relative 
effectiveness of schools through the use of indices from 
statistical models. 

Table 1

Specifications for Population Group Means

Variable   Mean     S.D.      Correlations
                              Input    SES      School
Output     539.00   34.360    .73      .45      .02
Input      474.00   27.320             .45      .02
SES         98.54    3.770                      .03
School      23.14    1.635


Table 2

Characteristics of Population Group Means

Variable   Mean     S.D.      Correlations
                              Input    SES      School
Output     535.37   40.06     .7772    .5329    -.0237
Input      477.70   31.91              .4135    -.0200
SES         98.21    4.20                       -.1855
School      23.14    1.54


Table 3

Plan for Distribution of Group Sizes

Number per Group    Number of Groups
 20 -  99                  18
100 - 199                  18
200 - 299                   9
300 - 399                   9


Table 4

Group Sizes

Group    N    Group    N    Group    N    Group    N    Group    N    Group    N
   1   166      10   185      19   293      28   179      37   188      46   156
   2    94      11   223      20   143      29    89      38   349      47   205
   3   213      12   330      21   270      30   145      39    24      48   150
   4    62      13   104      22    55      31   326      40   174      49   110
   5   104      14   368      23   399      32   143      41   259      50   337
   6    69      15    54      24    23      33   127      42   289      51   188
   7    71      16   375      25   241      34    48      43   323      52    20
   8   102      17   296      26    70      35   333      44    47      53    53
   9    42      18   103      27    97      36   129      45    69      54    75


Table 5

Characteristics of Generated Data

Based on the 54 Sample Group Means

Variable   Mean     S.D.      Correlations
                              Input    SES      School
Output     535.82   41.99     .7695    .4975     .0154
Input      477.43   32.86              .3960     .0577
SES         98.06    4.17                       -.1013
School      23.14    1.54

Based on the 9087 Individual Scores

Variable   Mean     S.D.      Correlations
                              Input    SES      School
Output     536.36   94.35     .7429    .4577    -.0130
Input      478.95   74.59              .4401    -.0302
SES         98.38   10.09                       -.0359
School      23.24    1.38


Table 6

A Priori, Sample, and Model Ranks

[Table body not legible in the source copy.]


Table 9

Summary of Confidence Interval Comparisons

Model    Significant Comparisons    Percent Significant
WSR1              246                      75.9
WSR2              138                      42.6
WSR3              142                      43.8
WR1               262                      80.9
WR2               196                      60.5
WR3               185                      57.1


Table 10

Relationship Between Performance Indices and A Priori
Classifications for Each Residuals Model

                               Performance Index
Model   A Priori              5     4     3     2     1
SRR     Effective            15     0     2     1     0
        Average               3     3     6     3     3
        Less Effective        0     1     1     0    16

SR2     Effective            15     0     2     1     0
        Average               3     3     6     3     3
        Less Effective        0     1     1     0    16

SR1     Effective            15     2     1     0     0
        Average               0     7     5     3     3
        Less Effective        0     0     0     1    17

IRR     Effective            15     2     1     0     0
        Average               3     5     5     2     3
        Less Effective        0     0     1     1    16

IR2     Effective            16     1     1     0     0
        Average               3     5     5     2     3
        Less Effective        0     0     1     1    16

IR1     Effective            15     2     1     0     0
        Average               2     8     2     3     3
        Less Effective        0     0     0     1    17


Table 11

Frequency of Differences Under Three Decision Rules

          At Least 2 Units    At Least 3 Units    At Least 4 Units
Models    Freq    Percent     Freq    Percent     Freq    Percent
SRR       302     93.2        240     74.1        240     74.1
SR2       302     93.2        240     74.1        240     74.1
SR1       323     99.7        304     93.8        255     78.7
IRR       320     98.8        287     88.6        240     74.1
IR2       321     99.1        288     88.8        256     79.0
IR1       323     99.7        304     93.8        255     78.7